Abstract

Two-state cooperativity is an important characteristic in protein 
folding. It is defined by a depletion of states lying energetically be- 
tween folded and unfolded conformations. While there are different 
ways to test for two-state cooperativity, most of them probe indirect 
proxies of this depletion. Generalized-ensemble computer simulations,
however, make it possible to identify this transition unambiguously
through a microcanonical analysis based on the density of states. Here
we perform a detailed characterization of several helical peptides using
coarse-grained simulations. The level of resolution of the coarse-grained
model makes it possible to study realistic structures ranging from small
α-helices to a de novo three-helix bundle, without biasing the force
field toward the native state of the protein. Linking thermodynamic and
structural features shows that while short α-helices exhibit two-state
cooperativity, the type of transition changes for longer chain lengths
because the chain forms multiple helix nucleation sites, stabilizing a
significant population of intermediate states. The helix bundle exhibits the signs of
two-state cooperativity owing to favorable helix-helix interactions, as 
predicted from theoretical models. The detailed analysis of secondary 
and tertiary structure formation fits well into the framework of sev- 
eral folding mechanisms and confirms features observed so far only in 
lattice models. 



1 Introduction 

Two-state protein folding is characterized by a single free-energy barrier
between folded and unfolded conformations at the transition temperature,
whereas downhill folders do not exhibit folding barriers ^ JLj . The analysis
of this property conveys important information on both the thermodynamics
and the kinetic pathways of proteins [21 E]. A widely used test for
a two-state transition is the calorimetric criterion, which probes features in
the canonical specific heat curve |1] . However, this criterion neither provides
a sufficient condition to identify two-state transitions [5], nor does it offer
a clear distinction between weakly two-state and downhill folders. Other
experimentally observable aspects of two-state cooperativity include sharp
transitions in certain order parameters, or features in chevron plots [HI [3].
All these methods focus on thermodynamic consequences of a depletion of
intermediate states; they do not probe it directly.



However, it is possible to determine the density of states in a standard
canonical computer simulation at a temperature $T^*$ of interest: sample the
probability density $p(E)$ of finding an energy $E$. The density of states
$\Omega(E)$ is then proportional to $p(E)\, e^{E/k_B T^*}$ and hence the entropy
is (up to a constant) given by
$S(E) = k_B \ln \Omega(E) = \mathrm{const.} + k_B \ln p(E) + E/T^*$. One may proceed
to analyze the system microcanonically, i.e., to study the thermodynamics
of $S(E)$, in the neighborhood of $\langle E \rangle_{T^*}$. The advantage is that we essentially
directly analyze the probability density $p(E)$ rather than merely looking at
its lowest moments, such as the specific heat. Such a microcanonical analysis
has been applied to a wide variety of problems, e.g., spin systems [TtlSllQlfTOt 
[m [121 [13] , nuclear fragmentation [I11[I5], colloids [12], gravitating systems 
[T71 [IB], off- lattice homo- and heteropolymer models [201 [IS], and protein 
folding [2ll|22l [231 [23 [251 [261 [27]. Two remarks are worthwhile: 

• If the transition is characterized by a substantial barrier, standard
canonical sampling suffers from the usual problem of getting stuck:
during a simulation the system might not cross the barrier often
enough to equilibrate the two coexisting ensembles. This, of course,
is true and needs to be avoided irrespective of whether one aims at a
canonical or microcanonical analysis. Many ways around this problem
have been proposed, e.g., multicanonical [28] or Wang-Landau [29]
sampling. In our study we employ replica-exchange molecular dynamics for
sampling coupled canonical ensembles [33] and combine the overlapping
energy histograms by means of the weighted histogram analysis method
(WHAM) [30] [31] [32], a minimum variance estimator for $\Omega(E)$.

• Accurately sampling the whole distribution $p(E)$ over some range of
interest requires better statistics than merely sampling its lowest
moments: there is a price for higher quality data. But then, a
microcanonical analysis taps into this quality, while a canonical analysis
of the much longer simulation run would not significantly improve
the observables. Recall that the canonical partition function
$Z(T) = \int dE \, \Omega(E) \, e^{-E/k_B T}$ is the Laplace transform of the
density of states $\Omega(E)$, an operation well-known to be (i) strongly
smoothing and thus (ii) hard to invert.
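
The relation between a canonical energy histogram and the microcanonical
entropy can be illustrated with a minimal sketch. It is not the analysis
code used in this work; the function name, the histogram layout, and the
default k_B = 1 are assumptions made for illustration only.

```python
import math

def entropy_from_histogram(energies, counts, temperature, k_B=1.0):
    """Return (E, S(E)) pairs from one canonical energy histogram.

    Uses S(E) = k_B ln p(E) + E/T* (up to an additive constant), where
    p(E) is estimated from the counts sampled at temperature T*.
    Empty bins carry no information and are skipped.
    """
    total = sum(counts)
    entropy = []
    for e, c in zip(energies, counts):
        if c > 0:
            entropy.append((e, k_B * math.log(c / total) + e / temperature))
    return entropy
```

Only entropy differences are meaningful here, and a single histogram covers
only the energies visited at that one temperature, which is why several
overlapping histograms are combined in practice.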

From a thermodynamic point of view a two-state transition is characterized
by two coexisting ensembles of conformations [6]. While this does not qualify
as a genuine (first-order) phase transition (because the free energy of finite
systems is always analytic), its finite-size equivalent can be unambiguously
characterized by monitoring the entropy $S(E)$. In the phase-coexistence
region it will exhibit a convex intruder due to the suppression of states of
intermediate energy. This can best be observed by defining the quantity
$\Delta S(E) = \mathcal{H}(E) - S(E)$, where $\mathcal{H}(E)$ corresponds to the
(double-)tangent to $S(E)$ in the transition region |S1 [HI 1231 Ell ID]. In a
finite system the existence of a barrier in $\Delta S(E)$ will imply a non-zero
microcanonical latent heat $\Delta Q$, defined by the interval over which $S(E)$
departs from its convex hull, and in turn leads to a "backbending" effect
(akin to a van der Waals loop) in the inverse microcanonical temperature
$T^{-1}(E) = dS/dE$ (e.g., [SI El [ini ESI [121 US]). A non-zero $\Delta Q$
demarcates a transition region, whereas a downhill folder (continuous
transition) will only exhibit a transition point, where the concavity of
$S(E)$ is minimal.
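
The double-tangent construction can be sketched numerically as follows. This
is an illustrative sketch under stated assumptions, not the authors'
implementation: the entropy is assumed to be given as (E, S) pairs sorted by
increasing E, a single intruder region is assumed, and all function names
are hypothetical.

```python
def concave_hull(points):
    """Upper hull of (E, S) points sorted by E (monotone-chain scan)."""
    hull = []
    for p in points:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Positive or zero cross product: hull[-1] lies on or below
            # the chord from hull[-2] to p, so it is not a hull vertex.
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross < 0:
                break
            hull.pop()
        hull.append(p)
    return hull

def delta_s(points):
    """Return (E, hull(E) - S(E)) for every input point."""
    hull = concave_hull(points)
    out, j = [], 0
    for e, s in points:
        while j + 2 < len(hull) and hull[j + 1][0] < e:
            j += 1
        (x1, y1), (x2, y2) = hull[j], hull[j + 1]
        out.append((e, y1 + (y2 - y1) * (e - x1) / (x2 - x1) - s))
    return out

def latent_heat(points, tol=1e-9):
    """Energy width of the region where S(E) lies below its hull."""
    inside = [i for i, (_, d) in enumerate(delta_s(points)) if d > tol]
    if not inside:
        return 0.0          # no intruder: transition point only
    return points[inside[-1] + 1][0] - points[inside[0] - 1][0]
```

With noisy data the tolerance `tol` would have to be chosen relative to the
statistical error of S(E); the value used here is only suitable for smooth
test curves.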

Extending a recent study [27], we focus here on the link between (i) the
nature of the transition (i.e., two-state vs. downhill), (ii) secondary structure,
and (iii) tertiary structure formation for several helical peptides using a
high-resolution, implicit-solvent coarse-grained model. The results will be
interpreted in terms of different frameworks of folding mechanisms, such
as the molten globule model and simple polymer collapse models [34] [35].
While all helical peptides presented in this work are artificially constructed
("de novo"), and have thus not naturally evolved, they exhibit the relevant
physics in a particularly clean way and are in this sense useful model systems.
(See Supporting Material for a further discussion of this point.)

2 Methods 

Coarse-grained (CG) Molecular Dynamics (MD) simulations were based on 
an intermediate resolution, implicit-solvent peptide model [36]. It accounts 
for amino acid specificity and is capable of representing genuine secondary 
structure without explicitly biasing the force field toward any particular
conformation (native or not). Table 1 lists the sequences of all studied peptides.
More details can be found in the Supporting Material. 

Replica-exchange MD simulations were performed using the ESPResSo
package [37]. All simulations were run in the canonical (NVT) ensemble
using a Langevin thermostat with friction constant $\Gamma = \tau^{-1}$, where
$\tau$ is the intrinsic unit of time of the CG model. The CG unit of energy,
$\mathcal{E}$, relates to thermal energy at room temperature via
$\mathcal{E} = k_B T_{\mathrm{room}} = 1.38 \times 10^{-23}\,\mathrm{J\,K^{-1}} \times 300\,\mathrm{K} \approx 0.6\,\mathrm{kcal\,mol^{-1}}$.
The temperature $T$ was expressed in terms of the intrinsic unit of energy,
$T = \mathcal{E}/k_B$. The equations of motion were integrated
with a time step $\delta t = 0.01\,\tau$.
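
The unit conversion above can be checked with a few lines. The constants are
the exact SI values of the Boltzmann and Avogadro constants and the
thermochemical calorie; the function names are hypothetical and introduced
only for this sketch.

```python
K_B = 1.380649e-23        # Boltzmann constant in J/K
N_A = 6.02214076e23       # Avogadro constant in 1/mol
J_PER_KCAL = 4184.0       # thermochemical kilocalorie in J

def cg_energy_unit_kcal_per_mol(t_room=300.0):
    """CG energy unit k_B * T_room expressed per mole, in kcal/mol."""
    return K_B * t_room * N_A / J_PER_KCAL

def reduced_to_kelvin(t_reduced, t_room=300.0):
    """Convert a temperature given in units of E/k_B to kelvin."""
    return t_reduced * t_room
```

The per-mole value follows from multiplying the per-molecule energy by the
Avogadro constant, a step that is implicit in the conversion quoted in the
text.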

Entropy, order parameters, and canonical averages were obtained from the
density of states, $\Omega(E)$, which itself was calculated from WHAM [30] [31] [32].
Details can again be found in the Supporting Material.
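
For orientation, the self-consistent WHAM equations for energy histograms
can be written down compactly. The following is a toy sketch, assuming
binned energies, k_B = 1, equal statistical weight for all samples, and a
fixed number of plain iterations; it is not the implementation used for the
results reported here, and the function name is hypothetical.

```python
import math

def wham_log_density_of_states(histograms, betas, energies, n_iter=2000):
    """Estimate ln Omega(E) (up to a constant) from several histograms.

    histograms[i][k] is the number of samples of replica i in energy bin
    k, betas[i] = 1/(k_B T_i), and energies[k] is the bin center.
    """
    n_samples = [sum(h) for h in histograms]
    total = [sum(h[k] for h in histograms) for k in range(len(energies))]
    f = [0.0] * len(betas)      # f_i = -ln Z_i, refined iteratively
    omega = [0.0] * len(energies)
    for _ in range(n_iter):
        # Omega(E) = sum_i H_i(E) / sum_i N_i exp(f_i - beta_i E)
        for k, e in enumerate(energies):
            denom = sum(n * math.exp(fi - b * e)
                        for n, fi, b in zip(n_samples, f, betas))
            omega[k] = total[k] / denom
        # exp(-f_i) = sum_E Omega(E) exp(-beta_i E)
        f = [-math.log(sum(w * math.exp(-b * e)
                           for w, e in zip(omega, energies)))
             for b in betas]
    return [math.log(w) if w > 0 else float("-inf") for w in omega]
```

A production version would work with logarithms throughout to avoid overflow
for large energy ranges and would iterate until the f_i stop changing rather
than for a fixed number of steps.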

Finally, the reader should observe that CG force fields — including the 
one used here — are usually constructed to reproduce the canonical ensemble, 
hence they strive to reproduce the free energy. However, individual enthalpic 
and entropic contributions will generally be off, because the reduced number 
of degrees of freedom lowers the entropy of CG conformations, and so the 
energies need to be adjusted to leave the free energy correct. For instance, in 
the absence of solvent both solvent energy and entropy must be parametrized 
into effective solute interaction energies. The entropies we calculate in this 
work are thus not to be confused with the entropies of the actual system. 
On the other hand, this of course does not prevent them from being exquisitely
sensitive observables for the thermodynamics of the CG model.

3 Results 

3.1 Secondary structure 

We first examine the structural and energetic properties of the sequence
(AAQAA)$_n$ with various chain lengths n = 3, 7, 10, 15. The n = 3 variant is
known as a stable α-helix folder and has been studied both experimentally
and computationally jlOl SH Il2l SSI SI] . The n = 7 peptide has also been 
shown to fold into a helix |12]. We find that all four peptides form a stable 
long helix in the lowest energy sector (see below), but are not aware of 
any structural study that would confirm this for the longer peptides with 
n = 10, 15. Since we will soon show that the latter two fold differently 
from the shorter ones, an experimental confirmation of their ground state 
structure would be very useful. 

For (AAQAA)$_3$, Fig. 1a shows a barrier in $\Delta S(E)$ as well as a
backbending in the inverse microcanonical temperature $T^{-1}(E)$, indicative of a
first-order-like transition. The two vertical lines mark the transition region
with the corresponding microcanonical latent heat $\Delta Q$. In the region
between $E = 40\,\mathcal{E}$ and $80\,\mathcal{E}$ mostly-helical and mostly-coil
conformations coexist, in agreement with the sharp transitions in the
helicity $\theta(E)$ (as determined by the STRIDE algorithm [52]) and the
number of helices in the chain, $H(E)$.
These results point to a clear two-state folder.
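
The two structural observables can be computed from a per-residue secondary
structure assignment. The sketch below assumes a one-letter string in which
'H' marks helical residues, as produced by typical assignment tools; the
function name and the optional minimum helix length are assumptions, not
part of the original analysis.

```python
def helicity_and_helix_count(ss, helix_codes="H", min_length=1):
    """Return (helicity, number of helices) for one conformation.

    ss is a per-residue string such as 'CCHHHHCC'. A helix is a maximal
    run of helical residues of at least min_length residues.
    """
    runs, current = [], 0
    for code in ss:
        if code in helix_codes:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    runs = [r for r in runs if r >= min_length]
    theta = sum(runs) / len(ss) if ss else 0.0
    return theta, len(runs)
```

Averaging both numbers over all conformations that fall into the same energy
bin would give curves of the kind denoted here by the helicity and the
number of helices as functions of E.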

Increasing the chain length from n = 3 to n = 15 (Figures 1b, c, d)
changes the nature of the transition significantly. While n = 7 still shows
a (lower) barrier in $\Delta S(E)$ and a non-zero microcanonical latent heat $\Delta Q$,
the cases n = 10 and n = 15 are downhill folders (no barrier in $\Delta S(E)$ and
monotonic $T^{-1}(E)$ curves). The transition region is replaced by a transition
point for which the concavity of $S(E)$ is minimal and $\Delta Q = 0$. This
process is associated with important structural changes around the transition
region/point as seen in the number of helices $H(E)$: while the curve is
monotonic for n = 3, it exhibits a peak with $H(E) > 1$ for bigger n, showing
that during the transition most conformations form more than one helix.
This suggests the existence of multiple helix nucleation sites upon folding
(see representative conformations at the transition point for n = 10, 15 in
Fig. 1).

In order to further elucidate the structural features of these chains around
the transition region/point, the fraction of secondary structure (i.e., helicity)
was analyzed as a function of both energy and residue index for helices
n = 3, 15. While for n = 3 helix nucleation appears mostly around the
center of the peptide and propagates symmetrically to the termini (Fig. 2a),
n = 15 shows two distinct peaks at an energy $E$ slightly below the transition
point (Fig. 2b). The results suggest the formation of two individual helices
placed symmetrically about the midpoint of the chain (around residue 35),
which only join into one long helix significantly below the transition point.
As will be discussed in Section 4, these two helices divide the system into two
distinct melting domains which fold non-cooperatively (i.e., folding one helix
does not help folding the other) [321 HZ]. The same conclusion can be drawn
from the probability distributions of forming an ?Ti-helix (see Supporting
Material).

To probe the behavior of simultaneous folding motifs within a chain, we
performed a microcanonical analysis of the 73-residue de novo three-helix
bundle α3D [38] (amino acid sequence given in Table 1). The CG model
used here has been shown to fold α3D with the correct native structure,
up to a root-mean-square deviation of 4 Å from the NMR structure [36].
While of similar length compared to (AAQAA)$_{15}$, it shows a discontinuous
transition (see Fig. 3) and thus a nonzero microcanonical latent heat during
folding. In the transition region the helicity increases sharply from 20%
to about 65%, and the average number of helices also increases sharply,
but monotonically, from 1.5 to 3. Unlike for the simple n = 7, 10, 15
helices, the transition region never samples more helix nucleation sites than
the number of helices at low energies. As can be seen from the representative
conformations shown in Fig. 3, the ensemble of folded states ($E \approx 130\,\mathcal{E}$)
consists of three partially formed helices in largely native chain topology; the
coexisting unfolded ensemble ($E \approx 225\,\mathcal{E}$) consists of a compact structure
containing transient helices. All these findings identify α3D as a two-state
folder.

To better monitor the formation of individual helices, we measured the
fraction of helicity as a function of energy and residue, see Fig. 4. Unlike
(AAQAA)$_n$ (Figure 2), α3D shows strong features due to its more interesting
primary sequence. The turn regions (dark color) delimiting the three helices
(light color) are clearly visible at low energies and correspond well to the
STRIDE prediction of the NMR structure, as shown in Table 1. Moreover, it
is clear from this figure that secondary structure formation happens
simultaneously (i.e., at the same energy) for all three helices, and that most of
the folding happens within the coexistence region (marked by the two
vertical lines). The residues which form the native turn regions do not show
any statistically significant signal of helix formation at any energy.
Secondary structure has almost entirely formed close to the folded ensemble in
the transition region (left-most vertical line), in line with the representative
conformations shown in Fig. 3.
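
A map of helicity as a function of energy and residue index can be
accumulated by binning conformations in energy. The sketch below assumes
that each sampled conformation is available as a pair (energy, secondary
structure string) and that all samples carry equal weight; a reweighted
analysis would use the WHAM weights instead. The function name is
hypothetical.

```python
def helicity_map(frames, e_min, e_max, n_bins, helix_code="H"):
    """Per-residue helix fraction in each energy bin.

    frames is an iterable of (energy, ss_string). Returns a list with
    one row per energy bin; a row is None if the bin was never visited.
    """
    width = (e_max - e_min) / n_bins
    counts = [0] * n_bins
    sums = [None] * n_bins
    for energy, ss in frames:
        if not e_min <= energy < e_max:
            continue
        b = min(int((energy - e_min) / width), n_bins - 1)
        if sums[b] is None:
            sums[b] = [0] * len(ss)
        for i, code in enumerate(ss):
            if code == helix_code:
                sums[b][i] += 1
        counts[b] += 1
    return [[s / c for s in row] if c else None
            for row, c in zip(sums, counts)]
```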

3.2 Tertiary structure 

A secondary structure analysis alone can only provide information on the
local aspects of folding. Several studies have highlighted the role of an
interplay between local and non-local interactions in protein folding cooperativity
(see e.g. Refs. [HI HHl [501 EZ]). Here we first analyze the size and shape
of the overall molecule by monitoring, respectively, the radius of gyration
$R_g = \sqrt{\lambda_x + \lambda_y + \lambda_z}$ and the normalized acylindricity
$c = (\lambda_x + \lambda_y)/(2\lambda_z)$ as
a function of $E$, expressed in terms of the three eigenvalues of the gyration
tensor $\lambda_x < \lambda_y < \lambda_z$. The results for the single helices n = 3 and n = 15 and
the three-helix bundle α3D are shown in Figure 5. (AAQAA)$_3$ shows sharp
features in both order parameters within the transition region, indicating an
overall structural compaction (in shape and size) of the chain as energy is
lowered. Observe that $c$ approaches 0.13 at high energy, which is close to
the random walk or self-avoiding walk values, both close to $c \approx 0.15$ [SH IS2].
The longer helix n = 15 shows a non-monotonic behavior in both $R_g(E)$ and
$c(E)$: while the radius of gyration exhibits a minimum around $E = 400\,\mathcal{E}$,
the normalized acylindricity displays a maximum. This indicates a structure
that is most compact and spherical $100\,\mathcal{E}$ above the transition point. This
dip in $R_g(E)$ corresponds to a chain collapse into "maximally compact
non-native states" [31] due to a non-specific compaction of the chain gradually
restricted by steric clashes, at which point secondary structure becomes
favorable. Upon lowering the energy, the radius of gyration increases and the
acylindricity decreases, because the peptide elongates while folding from a
compact globule into an α-helix. Results for the three-helix bundle are
similar: $R_g(E)$ and $c(E)$ also show a minimum and a maximum, respectively,
slightly above the transition region. This indicates a similar type of chain
collapse mechanism. However, non-monotonic features appear also at the
other end of the transition region ($E \approx 130\,\mathcal{E}$) where the radius of gyration
shows a maximum and the acylindricity plateaus. The evolution of the two
order parameters below the transition region is rather limited, suggesting
that only minor conformational changes take place (i.e., the shape of the
molecule stays steady while its size decreases slightly). In contrast, at high
energy both (AAQAA)$_{15}$ and α3D are still far away from a random walk
limit, as evidenced by the acylindricity being far away from 0.15.
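
Both order parameters follow from the eigenvalues of the gyration tensor. The
sketch below assumes one position per residue with equal weights and takes
the radius of gyration as the square root of the sum of the three
eigenvalues; the closed-form eigenvalue routine for symmetric 3x3 matrices
is a standard trigonometric method, swapped in here to keep the example free
of external libraries. All function names are hypothetical.

```python
import math

def _sym3_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix (trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    if p1 == 0.0:                       # matrix is already diagonal
        return [a[0][0], a[1][1], a[2][2]]
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det / 2.0))  # guard against rounding
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [e1, 3.0 * q - e1 - e3, e3]

def gyration_eigenvalues(coords):
    """Sorted eigenvalues (smallest first) of the gyration tensor."""
    n = len(coords)
    com = [sum(r[a] for r in coords) / n for a in range(3)]
    g = [[sum((r[a] - com[a]) * (r[b] - com[b]) for r in coords) / n
          for b in range(3)] for a in range(3)]
    return sorted(_sym3_eigenvalues(g))

def size_and_shape(coords):
    """Return (R_g, c) with c = (l_x + l_y) / (2 l_z), l_x <= l_y <= l_z."""
    lx, ly, lz = gyration_eigenvalues(coords)
    return math.sqrt(lx + ly + lz), (lx + ly) / (2.0 * lz)
```

With this definition a perfectly straight rod gives c = 0 and an isotropic
arrangement gives c = 1, consistent with reading a maximum in c(E) as the
most spherical structure.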

Chain collapse in longer chains (such as (AAQAA)$_{15}$ and α3D) can readily
be observed by monitoring tertiary contacts as a function of energy. Figure 6
shows the total number of non-local contacts (red curve) as well as the
number of native contacts alone (blue curve). Tertiary contacts are defined here
as pairs of residues that are more than five amino acids apart (this prevents
chain connectivity artifacts) and within a 10 Å distance (these numbers are
somewhat arbitrary, but their value does not affect the qualitative behavior
of Fig. 6). Native contacts correspond here to the set of abovementioned
tertiary contacts sampled with a frequency higher than 1% from a set of
10,000 low-energy conformations ($E < 50\,\mathcal{E}$). While the two curves are
virtually identical below the transition region (i.e., all contacts are native) and
of similar trend above it, they behave very differently inside that interval.
Although the number of native contacts monotonically increases as the
energy is lowered (transition from globule to native-like structure), the total
number of contacts shows a peak above the transition region and sharply
decreases inside it. To approach the native state, the peptide needs to break
more contacts of non-native type than it gains contacts which are native.
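
The contact definitions above translate directly into code. The sketch
assumes one representative position per residue (which site is used for the
distance is not specified here, so this is an assumption), coordinates in
Å, and hypothetical function names.

```python
import math

def tertiary_contacts(coords, min_separation=5, cutoff=10.0):
    """Residue pairs (i, j) more than min_separation apart along the
    chain and closer than cutoff in space."""
    pairs = set()
    for i in range(len(coords)):
        for j in range(i + min_separation + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                pairs.add((i, j))
    return pairs

def native_contact_set(low_energy_frames, min_frequency=0.01):
    """Contacts seen in more than min_frequency of low-energy frames."""
    counts = {}
    for coords in low_energy_frames:
        for pair in tertiary_contacts(coords):
            counts[pair] = counts.get(pair, 0) + 1
    n = len(low_energy_frames)
    return {pair for pair, c in counts.items() if c / n > min_frequency}

def count_contacts(coords, native):
    """Return (all tertiary contacts, native contacts) for one frame."""
    contacts = tertiary_contacts(coords)
    return len(contacts), len(contacts & native)
```

The difference between the two returned numbers is the count of non-native
contacts, the quantity that has to decrease inside the transition region.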

The non-monotonicity of this curve, as well as the $R_g$ data, invite a
comparison with the thermodynamics of water: upon cooling, liquid water
expands below 4 °C. Weak but isotropic van der Waals interactions are given
up for strong but directional hydrogen bonds. This energy/entropy balance
seems to occur in a very similar manner here, and essentially for the same
reason. Weak van der Waals side-chain interactions (i.e., tertiary contacts) are
replaced by hydrogen-bond interactions (i.e., secondary structure) at lower
energies. This further confirms the concept of a chain collapse into maximally
compact non-native states: upon lowering the energy (above the transition
region) the system has accumulated a large number of non-native contacts
due to a simple hydrophobicity-driven compaction mechanism. This idea was
proposed early on as the "hydrophobic collapse model" or "molten globule
model" [34] [35]. A similar effect was observed by Hills and Brooks using a
Gō model, where out-of-register contacts had to unfold in order to reach the
native state [53].

While a transient chain collapse upon cooling is present in both (AAQAA)$_{15}$
and α3D ($R_g(E)$ is non-monotonic, see Fig. 5b and c), its effect on tertiary
structure formation will greatly depend on the amino acid sequence. Figure
7 shows the number of tertiary contacts of the two peptides as a function of
energy and residue. The single helix n = 15 shows a uniformly small number
of tertiary contacts in the low energy region (due to the linearity of the helix)
and peaks above the transition point (which corresponds to the energy where
$R_g(E)$ is smallest). The tertiary contact distribution in the maximally
compact non-native states is homogeneous along the chain (i.e., all residues have
the same number of contacts). On the other hand, the number of tertiary
contacts along the three-helix bundle (Fig. 7b) is highly structured, forming
stripes as a function of residue that extend below the transition region. This
follows directly from the amphipathic nature of the subhelices that
constitute α3D: residues that form the native hydrophobic core of the bundle have
a higher number of contacts. The presence of these stripes in the energetic
region of collapsed structures ($E \approx 300\,\mathcal{E}$) is due to a strong selection
between hydrophobic and polar amino acids during the hydrophobic collapse,
burying hydrophobic groups inside the globule. The low number of tertiary
contacts in the turn regions indicates that they remain on the surface of the
maximally compact globule during chain collapse.



4 Discussion 

Two-state cooperativity has been characterized as a common signature of 
small proteins for which the transition of the cooperative domain corresponds 
to the whole molecule (i.e., the protein undergoes a transition as a whole) 
[5^ . While this framework applies well to the small helix (AAQAA)$_3$, it
is difficult to predict its thermodynamic signature from other grounds: a 
description of the conventional helix-coil transition is not appropriate due 
to the small size of the system and the correspondingly important finite-size 
effects. 

The thermodynamic signature of proteins can better be described for
longer chains. Several arguments can be brought forward to explain the
transition we observe for the longer helices (AAQAA)$_n$ for n = 10, 15:

1. Most theoretical models of the helix-coil transition, such as the
Zimm-Bragg model [55], are based on the one-dimensional Ising model, which,
being one-dimensional, shows no genuine phase transition but only
a finite peak in the specific heat. The entropic gain of breaking a
hydrogen bond (i.e., forming two unaligned spins) outweighs the
associated energetic cost for a sufficiently long chain.

2. The structure of the maximally compact state right above the transition
(see Fig. 7) indicates that there is no statistically significant
competition between amino acids (i.e., all residues have the same number of
tertiary contacts) and is therefore associated with a homopolymer-type
of collapse, which is indeed barrierless [34] [56].

3. The denaturation of large proteins composed of several "melting"
domains is not a two-state transition [171 116]. The presence of two helix
nucleation sites around the transition point (Fig. 2) indicates the
existence of two such melting domains that fold non-cooperatively: folding
one helix is not correlated with the formation of the other. We have
checked that there are no statistically significant helix-helix interactions
between the two domains by calculating contact maps. These were
averaged over the ensemble of conformations for which
$50\,\mathcal{E} < E < 150\,\mathcal{E}$ (data not shown).

The common expectation is that bigger systems show sharper transition signals,
and it might thus appear surprising that the transition of the (AAQAA)$_n$
sequence weakens for increasing n. However, one needs to bear two things
in mind. First, size alone is not sufficient; dimensionality counts as well.
In the Supporting Material we show examples of quasi-one-dimensional
systems for which transitions become weaker for bigger systems, because in the
process of growing they become "more one-dimensional." When size is
associated with cooperativity, one tends to think of globular (three-dimensional)
systems, for which the size-cooperativity connection is true, but this is not
the most general case. Second, the sharpness might depend on which
observable one studies. The helicity $\theta$ as a function of temperature indeed
varies more sharply for larger n, making the response function $(d\theta/dT)_n$ peak
more strongly for bigger n. While this steepening would suggest a stronger
two-state nature, this goes against every other observable, all of which
suggest a downhill folder, including the calorimetric criterion (see below);
observing response functions alone can thus be misleading. More details on
this can be found in the Supporting Material.

The two-state signature of the helix bundle α3D can be understood from
two different perspectives:

1. While there are clearly three distinct folding motifs (i.e., three
helices), the selective hydrophobicity (i.e., amphipathic sequence)
between residues provides cooperativity: folding one helix helps the
formation of the others.

2. The barrier associated with a two-state transition is interpreted in the
hydrophobic collapse model as the result of the cost of breaking
hydrophobic contacts from a maximally compact state into the folded
ensemble [34]. A further discussion on the order of appearance of
secondary vs. tertiary structure formation can be found in the Supporting
Material.

Experimental studies of α3D showed fast folding on a timescale of (1-5) μs and
single-exponential kinetics [SZ], compatible with a two-state cooperative transition.
As presented here, this highlights the interplay between secondary
structure formation (see Fig. 4) and the loss of non-native tertiary contacts (see
Fig. 6), both occurring exactly within the coexistence region, as a possible
mechanism for folding cooperativity [27].

Compaction of the unfolded state upon temperature increase has been
observed experimentally by Nettels et al. using single-molecule FRET.
While in our simulations the decrease in the radius of gyration can be
explained by a combination of the hydrophobic effect and the loss of helical
structure, Nettels et al. showed similar behavior also for an intrinsically
disordered hydrophilic protein, where other mechanisms likely play a role.

The present work has so far avoided any reference to free energy barriers.
While the nature of the finite-size transition can unambiguously be
characterized from the presence of a convex intruder in the entropy $S(E)$ [S], which
implies a non-zero latent heat $\Delta Q$, the mere existence of a free energy barrier
is not a strong criterion because, first, the definition of a free energy barrier
is not unique in a finite-size system [HI [7] and, second, the height of the
barrier depends on the reaction coordinate used. Chan [52] therefore argued
that the calorimetric criterion, which relates the van't Hoff and calorimetric
energies, is often more restrictive on protein models than the existence of
such a free energy barrier. Still, the density of states calculations performed
here correlate well with calorimetric ratios for (AAQAA)$_n$, n = {3, 7, 10, 15}
and α3D: $\delta$ = 0.78, 0.76, 0.51, 0.52 and 0.78, respectively. These were
determined by analyzing the canonical specific heat curve $C_V(T)$ as in Kaya and
Chan [4] ($\kappa_2$ without baseline subtraction). The value $\delta$ = 0.78 for α3D also
agrees with an earlier theoretical calculation of the similar bundle α3C from
Ghosh and Dill [49], who found $\delta$ = 0.72.
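
A calorimetric ratio of this kind can be evaluated directly from a density
of states. The sketch below is an assumption-laden illustration rather than
the procedure used for the numbers above: it takes the van't Hoff energy in
the commonly quoted two-state form, 2 T_max sqrt(k_B C_V(T_max)), divides it
by the integral of C_V over the chosen temperature window, applies no
baseline subtraction, and sets k_B = 1 by default. Function names are
hypothetical.

```python
import math

def canonical_energy_moments(log_omega, energies, temperature, k_B=1.0):
    """Return (<E>, C_V) at one temperature from ln Omega(E)."""
    logw = [lo - e / (k_B * temperature)
            for lo, e in zip(log_omega, energies)]
    shift = max(logw)                   # avoid overflow in exp
    w = [math.exp(x - shift) for x in logw]
    z = sum(w)
    mean = sum(wi * e for wi, e in zip(w, energies)) / z
    var = sum(wi * (e - mean) ** 2 for wi, e in zip(w, energies)) / z
    return mean, var / (k_B * temperature ** 2)

def calorimetric_ratio(log_omega, energies, temperatures, k_B=1.0):
    """van't Hoff over calorimetric energy, no baseline subtraction."""
    cv = [canonical_energy_moments(log_omega, energies, t, k_B)[1]
          for t in temperatures]
    i_max = max(range(len(cv)), key=cv.__getitem__)
    # Calorimetric energy: trapezoidal integral of C_V over the window.
    h_cal = sum(0.5 * (cv[i] + cv[i + 1])
                * (temperatures[i + 1] - temperatures[i])
                for i in range(len(cv) - 1))
    return 2.0 * temperatures[i_max] * math.sqrt(k_B * cv[i_max]) / h_cal
```

For an idealized two-level system with a highly degenerate upper level the
ratio comes out close to one, while a broad distribution of intermediate
energies lowers it; the result depends on the temperature window used for
the integral.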

5 Conclusion 

Replica-exchange MD simulations of an intermediate resolution CG implicit- 
solvent peptide model allowed us to accurately determine the thermody- 
namics of folding for several helical peptides, without biasing the force field 
toward a particular native structure. We argued that a microcanonical anal- 
ysis is extremely valuable when characterizing the energetics and structure 
of peptides, for two reasons. First, an accurate density of states calcula- 
tion allowed the unambiguous characterization of the nature of the folding 
transition; and second, different order parameters, analyzed as a function of 
E, have exhibited highly non-monotonic behavior inside the (first-order-like) 
transition region. A corresponding canonical analysis (i.e., as a function of 
T) would not allow us to observe in such detail many of the abovementioned 
features around transition regions. 

The results showed that simply elongating the (AAQAA)$_n$ sequence
induced a change in the nature of the transition, from two-state (n = 3, 7) to
downhill (n = 10, 15). This correlated with the number of helices sampled
around the transition region/point which is indicative of the average number
of helix nucleation sites, thus characterizing the number of distinct
melting domains and the structural diversity of intermediates. Remarkably, the
loss of a first-order signature still goes along with a potentially misleading
steepening of the helicity as a function of temperature for longer chains (see
Supporting Material). The bundle α3D was found to be two-state
cooperative, in agreement with theoretical models [49] [50]. The analysis of tertiary
structure formation highlighted the influence of the amino acid sequence on
the folding mechanism, using the hydrophobic collapse model as a starting
point.

While previous studies have brought forward the coupling between
secondary and tertiary structure formation for two-state cooperativity (e.g.,
[m SSI lini [27]), we illustrated here several links between the nature of the
transition and secondary/tertiary structure signatures of folding for
realistic representations of peptide chains. Reaching a thorough understanding of
structure formation in two-state cooperative proteins will provide insight into
the stability of their folded conformation. Cooperativity improves stability
of the folded state by suppressing the population of intermediates.
Mutations that lower cooperativity not only decrease stability, they have also
been shown to promote misfolding in certain cases [60]. The resolution of the CG model
provides a useful compromise between computational efficiency and
resolution in order to access features that were so far only observed in less realistic
lattice models.

Peptide         Sequence
helix n = 3     (AAQAA)3
helix n = 7     (AAQAA)7
helix n = 10    (AAQAA)10
helix n = 15    (AAQAA)15
bundle α3D      MGSWA EFKQR LAAIK TRLQA LGGSE...
                AELAA FEKEI AAFES ELQAY KGKGN...
                PEVEA LRKEA AAIRD ELQAY RHN

Table 1: Amino acid sequences of the peptides studied in this work. The
three helical regions of the native state (from NMR structure, PDB 2A3D)
of the helix bundle α3D [38] are underlined (as predicted by STRIDE [39]).

List of Figure captions

Figure 1 Various observables as a function of energy for (AAQAA)$_n$: (a)
n = 3, (b) n = 7, (c) n = 10, (d) n = 15. From top to bottom for
each inset: $\Delta S(E)$, error bars reflect the variance of the data points
(1σ interval); inverse temperatures from a canonical
($T^{-1}(\langle E_{\mathrm{can}} \rangle)$, blue) and a microcanonical
($T^{-1} = dS/dE$, red) analysis, where $\langle E_{\mathrm{can}} \rangle$ is
the canonical average energy; helicity $\theta(E)$ (red) and number of helices
$H(E)$ (blue), both with the error of the mean. Vertical lines mark
either the transition region (n = 3, 7) or the transition point (n =
10, 15). Representative conformations at different energies, visualized
using VMD [13], are shown.

Figure 2 Fraction of secondary structure as a function of energy and residue
for (a) (AAQAA)$_3$ and (b) (AAQAA)$_{15}$. Vertical lines mark the
transition region (a) and point (b), respectively.

Figure 3 Various observables as a function of energy for α3D. Plots and
definitions agree with the conventions in Fig. 1.

Figure 4 Fraction of secondary structure as a function of energy and residue
for α3D. Vertical lines mark the transition region.

Figure 5 Radius of gyration $R_g(E)$ (red) and normalized acylindricity
parameter $c(E)$ (blue), both with the error of the mean, for (a) (AAQAA)$_3$,
(b) (AAQAA)$_{15}$, and (c) α3D. Vertical lines mark either the transition
region (n = 3, α3D) or the transition point (n = 15).

Figure 6 Number of tertiary contacts for α3D as a function of energy. The
"All contacts" curve (red) averages over all non-local pairs whereas
the "Native only" curve (blue) only counts native pairs (see text for
details). Vertical lines mark the transition region.

Figure 7 Number of tertiary contacts as a function of energy and residue for
(a) (AAQAA)$_{15}$ and (b) α3D. Observe that the dynamic range of (b)
is four times as wide as that for (a). Vertical lines mark the transition
point (a) and region (b).